feat(experts): kb.experts — rank entities by evidence density on a topic (closes #315)#347

Open
e11734937-beep wants to merge 2 commits into vouchdev:main from e11734937-beep:feat/vouch-experts

@e11734937-beep e11734937-beep commented Jul 3, 2026


Summary

Closes #315. Adds kb.experts — a read-only query that answers "who/what does the kb know about X." Given a free-text topic, it ranks the entities carrying the most matched evidence, so an agent can pull the right write-ups or ask the right follow-up.

Pure read: it aggregates approved, live claims and returns a ranking — no propose_*, no approve, no writes, no network, no LLM. The review gate is untouched by construction.

Surface

  • cli: vouch experts "<topic>" [--limit N] [--min-claims N] [--weight count|recency|citation] [--json]
  • method: kb.experts(topic, limit=10, min_claims=1, weight="count"), wired at all four registration sites — server.py (kb_experts MCP tool), jsonl_server.py (_h_experts + HANDLERS), capabilities.py (METHODS), cli.py (vouch experts). test_capabilities stays green.
  • matching: FTS hits on the topic (index_db.search) plus the substring pass on entity name/aliases (salience._substring_entity_ids); aggregate the entities referenced by the matched claims.
  • weights: count (matched-claim count), recency (half-life decay on last_confirmed_at/updated_at), citation (distinct evidence ids × confidence). An unknown weight falls back to count (never raises).
  • result: [{entity_id, name, type, claim_count, citation_count, score, top_claim_ids}], ordered by descending score with a stable tie-break on entity_id.
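The weights and ordering listed above can be sketched roughly as follows. This is a minimal illustration, not the code in src/vouch/experts.py: the `Claim` dataclass, the `score_claim` and `rank` names, and the 30-day half-life are assumptions, and summing `distinct evidence ids × confidence` per claim is only one possible reading of the citation weight.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 30.0  # assumed value; the PR does not state the half-life


@dataclass(frozen=True)
class Claim:  # hypothetical stand-in for a matched, live claim
    id: str
    entity_id: str
    confidence: float
    evidence_ids: tuple[str, ...]
    last_confirmed_at: datetime


def score_claim(claim: Claim, weight: str, now: datetime) -> float:
    """Contribution of one matched claim to its entity's score."""
    if weight == "recency":
        # Half-life decay: a claim confirmed HALF_LIFE_DAYS ago counts 0.5.
        age_days = max((now - claim.last_confirmed_at) / timedelta(days=1), 0.0)
        return 0.5 ** (age_days / HALF_LIFE_DAYS)
    if weight == "citation":
        return len(set(claim.evidence_ids)) * claim.confidence
    # "count" and any unknown weight: one point per claim, never raises.
    return 1.0


def rank(claims, weight="count", limit=10, min_claims=1, now=None):
    """Group claims by entity, score each entity, and order the rows."""
    now = now or datetime.now(timezone.utc)
    by_entity = defaultdict(list)
    for claim in claims:
        by_entity[claim.entity_id].append(claim)
    rows = [
        {
            "entity_id": entity_id,
            "claim_count": len(group),
            "score": sum(score_claim(c, weight, now) for c in group),
        }
        for entity_id, group in by_entity.items()
        if len(group) >= min_claims
    ]
    # Descending score, then ascending entity_id as the stable tie-break.
    rows.sort(key=lambda row: (-row["score"], row["entity_id"]))
    return rows[:limit]
```

Sorting on the `(-score, entity_id)` tuple keeps the output deterministic when two entities tie, which is the property the tie-break test in this PR checks.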

Scope & correctness

Ranking/status logic lives in a new dedicated src/vouch/experts.py (not recall.py, and not storage.py, which stays pure I/O). Claims with status superseded/archived/redacted are excluded so a non-live claim never inflates a score (issue #78). The query runs entirely against the local .vouch/ kb; zero network, zero LLM.
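As a rough sketch of the status exclusion described above, assuming claims are available as plain dicts with a `status` key (the dict shape and the `live_claims` name are hypothetical, not taken from the PR's code):

```python
# Statuses named in the PR description as never contributing to a score.
EXCLUDED_STATUSES = frozenset({"superseded", "archived", "redacted"})


def live_claims(claims):
    """Return only the claims whose status is not in the excluded set."""
    return [c for c in claims if c.get("status") not in EXCLUDED_STATUSES]
```

Filtering before aggregation, rather than after scoring, means an excluded claim cannot affect claim_count, citation_count, or score.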

Tests

tests/test_experts.py covers each weight mode, the status-exclusion filter, min_claims/limit, unknown-weight fallback, empty-kb / no-match, and the deterministic tie-break. tests/test_capabilities.py stays green with kb.experts present at all sites.

Verification

  • python -m ruff check src tests — clean
  • python -m mypy src — clean (0 errors, 80 files)
  • python -m pytest — full suite green (7 new tests; no regressions)

Summary by CodeRabbit

  • New Features

    • Added expert-ranking support across the app, including a new command, server tool, and request method.
    • Results can now be filtered and sorted by evidence-based scoring, with options for limits, minimum matches, and different ranking weights.
    • Added JSON output for machine-readable results.
  • Tests

    • Added coverage for ranking behavior, filtering, empty results, tie-breaking, and fallback handling.

Closes vouchdev#315.

Add a read-only kb.experts query: given a free-text topic, rank the entities
carrying the most matched evidence (count / recency / citation weightings)
identically across mcp / jsonl / cli. Aggregates approved, live claims only —
excludes superseded/archived/redacted so a non-live claim never inflates a
score; no proposals, writes, network, or llm. Ranking lives in a new
src/vouch/experts.py, wired at the four registration sites.
@github-actions github-actions Bot added the labels cli (command line interface), mcp (mcp, jsonl, and http surfaces), tests (tests and fixtures), and size: M (200-499 changed non-doc lines) on Jul 3, 2026
@coderabbitai

coderabbitai Bot commented Jul 3, 2026


Review details
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 0ca023cb-48a3-4a26-ad26-6a4c1f3cee97

📥 Commits

Reviewing files that changed from the base of the PR and between 3abe49e and 81b2b55.

📒 Files selected for processing (1)
  • tests/test_experts.py
📝 Walkthrough


Adds a new read-only kb.experts capability that ranks entities by evidence density for a given topic. Implements rank_experts in a new module, wires it into the MCP tool, JSONL handler, CLI command, and capabilities list, and adds corresponding tests.

Changes

kb.experts ranking feature

| Layer / File(s) | Summary |
| --- | --- |
| Core ranking logic: src/vouch/experts.py | New rank_experts function collects candidate claims via FTS and entity/alias matching, excludes superseded/archived/redacted claims, scores entities by count/recency/citation weight, filters by min_claims, and returns a deterministically ordered ranked list. |
| MCP, JSONL, CLI, and capabilities wiring: src/vouch/server.py, src/vouch/jsonl_server.py, src/vouch/cli.py, src/vouch/capabilities.py | Adds kb_experts MCP tool, _h_experts JSONL handler registered under HANDLERS["kb.experts"], a new vouch experts CLI command with limit/min-claims/weight/json options, and adds "kb.experts" to the METHODS capability list. |
| Ranking tests: tests/test_experts.py | New test suite covering count/citation weighting, min_claims/limit behavior, status exclusion, weight fallback, empty/no-match handling, and tie-breaking. |

Estimated code review effort: 2 (Simple) | ~15 minutes

Sequence Diagram(s)

sequenceDiagram
  participant CLI as vouch experts
  participant MCP as kb_experts tool
  participant JSONL as _h_experts handler
  participant Experts as rank_experts
  participant Store as KBStore

  CLI->>Experts: rank_experts(store, topic, limit, min_claims, weight)
  MCP->>Experts: rank_experts(store, topic, limit, min_claims, weight)
  JSONL->>Experts: rank_experts(store, topic, limit, min_claims, weight)
  Experts->>Store: query matching claims/entities
  Store-->>Experts: matched claims and entities
  Experts-->>CLI: ranked expert rows
  Experts-->>MCP: ranked expert rows
  Experts-->>JSONL: ranked expert rows

Related issues: #315

Suggested labels: enhancement, feature

Suggested reviewers: vouchdev-maintainers

🐰 A topic whispered, and experts appear,
ranked by claims and citations held dear,
through cli, mcp, and jsonl paths they flow,
superseded voices excluded from the show,
a tidy hop for evidence far and near.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly names the new kb.experts feature and its ranking purpose. |
| Linked Issues check | ✅ Passed | The PR appears to implement #315 across MCP, JSONL, CLI, capabilities, and tests, including matching, weights, filters, exclusions, and stable ordering. |
| Out of Scope Changes check | ✅ Passed | The diff is focused on the kb.experts feature and its tests, with no clear unrelated changes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/vouch/experts.py (1)

17-17: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Reaching into a private helper across module boundaries.

_substring_entity_ids is underscore-prefixed, signaling it's internal to salience. Importing it directly from experts.py couples the two modules to an unstable private API. Consider exposing a public wrapper (e.g. substring_entity_ids) in salience.py if this matching logic is meant to be reused.
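One way the suggested public wrapper might look. The real signature and body of `_substring_entity_ids` are not shown in this PR, so the stand-in below (entities as dicts with `id`, `name`, and optional `aliases`, plus a case-insensitive substring test) is an assumption made only so the sketch runs on its own.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping


def _substring_entity_ids(entities: Iterable[Mapping], topic: str) -> list[str]:
    # Stand-in body (assumed): the actual helper in salience.py may take
    # different arguments and match differently.
    needle = topic.casefold()
    hits = []
    for entity in entities:
        names = [entity["name"], *entity.get("aliases", [])]
        if any(needle in name.casefold() for name in names):
            hits.append(entity["id"])
    return hits


def substring_entity_ids(entities: Iterable[Mapping], topic: str) -> list[str]:
    """Public entry point; other modules import this, not the private helper."""
    return _substring_entity_ids(entities, topic)
```

With a wrapper like this, experts.py would depend only on the public name, and the private helper stays free to change inside salience.py.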

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/vouch/experts.py` at line 17, The import in experts.py is reaching into
the private helper _substring_entity_ids from salience.py, which couples modules
to an internal API. Expose a public wrapper or renamed public function in
salience.py, such as substring_entity_ids, and update experts.py to import and
use that public symbol instead of the underscore-prefixed helper. Keep the
matching logic in salience.py and route any cross-module reuse through the
public entry point.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_experts.py`:
- Around line 1-126: `tests/test_experts.py` only exercises `rank_experts`
directly; add coverage for the `kb.experts` JSONL request/response envelope so
both success and failure cases are validated. Introduce tests around the
`kb.experts` entrypoint that assert a request yields `{id, ok, result}` on
success and `{id, ok: false, error}` on failure, using the existing
`rank_experts`/`KBStore` setup to keep the assertions aligned with the current
ranking behavior.

---

Nitpick comments:
In `@src/vouch/experts.py`:
- Line 17: The import in experts.py is reaching into the private helper
_substring_entity_ids from salience.py, which couples modules to an internal
API. Expose a public wrapper or renamed public function in salience.py, such as
substring_entity_ids, and update experts.py to import and use that public symbol
instead of the underscore-prefixed helper. Keep the matching logic in
salience.py and route any cross-module reuse through the public entry point.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d8578c4d-65a1-4ccc-8e2e-e2065cfcd901

📥 Commits

Reviewing files that changed from the base of the PR and between 5f58c69 and 3abe49e.

📒 Files selected for processing (6)
  • src/vouch/capabilities.py
  • src/vouch/cli.py
  • src/vouch/experts.py
  • src/vouch/jsonl_server.py
  • src/vouch/server.py
  • tests/test_experts.py

Comment thread tests/test_experts.py
The suite exercised rank_experts() directly but not the kb.experts JSONL
entrypoint. Add two envelope tests mirroring tests/test_jsonl_server.py:
a well-formed request returns {id, ok, result} with the ranking under
result["experts"], and a request missing the required `topic` param returns
the {id, ok: false, error} failure envelope (code "missing_param").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
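The two envelopes named in the commit message above can be illustrated with a small dispatcher. This is a hedged sketch, not the code in jsonl_server.py: the request shape (`id`, `method`, `params`), the `MissingParam` exception, the `handle_line` name, and the `{code, message}` layout of `error` are assumptions; only the `{id, ok, result}` and `{id, ok: false, error}` envelopes, the `result["experts"]` key, and the "missing_param" code come from the text.

```python
import json


class MissingParam(Exception):
    """Raised by a handler when a required request parameter is absent."""


def _h_experts(params):
    if "topic" not in params:
        raise MissingParam("topic")
    # Placeholder ranking; the real handler would call rank_experts here.
    return {"experts": []}


HANDLERS = {"kb.experts": _h_experts}


def handle_line(line: str) -> dict:
    """Turn one JSONL request line into a success or failure envelope."""
    request = json.loads(line)
    request_id = request.get("id")
    try:
        result = HANDLERS[request["method"]](request.get("params", {}))
    except MissingParam as exc:
        return {
            "id": request_id,
            "ok": False,
            "error": {
                "code": "missing_param",
                "message": f"missing required param: {exc}",
            },
        }
    return {"id": request_id, "ok": True, "result": result}
```

A test in the spirit of the commit would send one well-formed line and one line without `topic`, then assert on `ok` and on either `result["experts"]` or `error["code"]`. Unknown methods are not handled in this sketch.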

Labels

cli (command line interface), mcp (mcp, jsonl, and http surfaces), size: M (200-499 changed non-doc lines), tests (tests and fixtures)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: kb.experts — rank entities by evidence density on a topic

1 participant